A Systematic Literature Review of the Relationship between Serum Ferritin and Outcomes in Myelodysplastic Syndromes

Anemia is the most common form of cytopenia in patients with myelodysplastic syndromes (MDS), who require chronic red blood cell transfusions and may present high serum ferritin (SF) levels as a result of iron overload. To better understand the potential effects of high SF levels, we conducted a systematic literature review (SLR) to identify evidence on the relationship between SF levels and clinical, economic, or humanistic outcomes in adult patients with MDS. Of 267 references identified, 21 were included. No studies assessing SF levels and their relationship with humanistic or economic outcomes were identified. Increased SF levels were an indicator of worse overall survival and other worsened outcomes; however, the association was not consistently significant. SF levels were a significant prognostic factor for relapse incidence of MDS and showed a significant positive correlation with number of blood units transfused but were not associated with progression to acute myeloid leukemia or the time to transformation. Higher SF levels were also an indicator of a lower likelihood of leukemia-free survival, relapse-free survival, and event-free survival. The SLR suggests that SF levels are associated with clinical outcomes in MDS, with higher levels correlated with number of blood units transfused, frequently indicating worse outcomes.


Introduction
Myelodysplastic syndromes (MDS) are a heterogeneous group of clonal disorders in hematopoietic stem cells, characterized by ineffective hematopoiesis resulting in abnormally low levels of normal red blood cells (RBCs), white blood cells, platelets, or combinations of these cells [1]. Although patients also suffer from an increased risk of infection or hemorrhage and may even progress to acute myeloid leukemia (AML), anemia is the most common form of cytopenia in MDS [2]. To overcome anemia, patients with MDS require chronic RBC transfusions [3], often resulting in iron overload [4,5], which is reflected by high serum ferritin (SF) levels.
While the gold standard for measuring iron burden is T2* magnetic resonance imaging (MRI) of the heart and R2* MRI of liver iron concentration (LIC) [6], the SF level is an acceptable and especially meaningful surrogate marker that is available worldwide [7]. High SF levels indicative of iron overload are generally considered to be >1000 µg/L, a level associated with harmful consequences such as organ damage and increased mortality [8]. The risks of cardiac events and hepatic complications are also increased by iron overload, so this complication of regular transfusions is considered to be an independent prognostic variable of overall survival (OS) [9].
To explore the relationship between SF levels and outcomes, we performed a systematic literature review (SLR) to collect the available evidence on the relationship between SF levels and clinical, economic, and humanistic outcomes in patients with MDS.

Methods
A systematic search and analysis of the literature was undertaken to identify evidence on the relationship between SF levels and the clinical efficacy, safety, and the economic and humanistic burden of illness in patients with MDS. The SLR was conducted according to the standards set forth by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [10,11] and the Cochrane Handbook for Systematic Reviews of Interventions [12].
Searches were developed according to established guidelines [10-13] to identify studies of interest in Embase and MEDLINE and MEDLINE In-Process (both via Ovid); search strategies included a combination of free-text searches and controlled vocabulary terms (Supplementary Materials Tables S1 and S2). Proceedings from conferences of the American Society of Hematology, the European Hematology Association, and the International Society for Pharmacoeconomics and Outcomes Research were searched for relevant abstracts to identify submissions from 1 January 2018 to 23 April 2020. Bibliographies of systematic reviews and/or meta-analyses reporting SF levels in patients with MDS published since 1 January 2018 were also used to identify additional relevant publications.
Predefined inclusion and exclusion criteria (Table 1) were used to evaluate the titles and abstracts of records identified from the searches, and full-text articles of the abstracts deemed relevant were retrieved and examined. Studies that failed to meet the inclusion criteria or were ineligible for inclusion were rejected, and reasons for rejection were captured. Studies were required to report on the association of SF levels with outcomes of interest in patients with MDS, with the relationship evaluated via univariate or multivariate models. All screening was conducted by two independent investigators; screening decisions required agreement between the two, and any disagreements were resolved by a third investigator. Data extraction was performed by one researcher for studies meeting all inclusion criteria and validated by a second researcher, with discrepancies resolved by a third. Risk of bias in the included studies was assessed via the Quality in Prognostic Studies (QUIPS) tool [14].

Results
Searches were conducted on 23 April 2020, returning 362 references. From these, 95 duplicates were removed, and 267 abstracts were screened. Among them, 61 references were evaluated at the full-text level, 41 of which were excluded. Supplementary searches of conference presentations identified an additional reference, resulting in 21 studies eligible for inclusion in the SLR (Figure 1).
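The study-flow counts above can be checked with simple arithmetic. The following is a minimal sketch that uses only the numbers reported in this paragraph; the variable names are illustrative and are not taken from the review protocol.

```python
# Study-flow counts as reported in the text (variable names are illustrative).
records_identified = 362
duplicates_removed = 95
full_text_assessed = 61
full_text_excluded = 41
added_from_conference_search = 1

# Abstracts screened after de-duplication.
abstracts_screened = records_identified - duplicates_removed

# Studies retained after full-text review of database records.
retained_after_full_text = full_text_assessed - full_text_excluded

# Final number of included studies.
studies_included = retained_after_full_text + added_from_conference_search

print(abstracts_screened, retained_after_full_text, studies_included)
```

The totals agree with the 267 screened abstracts and 21 included studies stated in the text.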

Risk of bias assessed via the QUIPS tool indicated that studies were generally of good quality with low risk of bias. Baseline characteristics were inconsistently reported across five studies [15,16,21,25,34], and three studies were unclear in their information on variables in multivariate models [19,21,30]. Despite these minor gaps in information, studies consistently reported other elements (study attrition, prognostic factor, outcome measurement, and statistical discussion), resulting in a low risk of bias across the remaining categories (Supplementary Materials Table S3).
Of the 21 observational studies identified, most (n = 15) examined retrospective cohorts, while six reported on prospective cohorts. Geographic locations across studies varied: three studies in Japan [22,23,27]; two each in the United States [26,29], Canada [30,35], the Czech Republic [16,21], and Turkey [15,32]; and one each from China [24] and New Zealand [20]. The remaining data were from a range of European countries. When the study setting was reported, data were measured in hospital settings, outpatient settings, cancer centers, and hematology-specific sites and centers. Most studies (n = 17) evaluated at least 50 patients, although the study populations ranged from 35 patients in two studies [15,19] to 419 in a cohort of patients treated at four hematologic centers in Austria (Table 2) [33].
Study population mean ages ranged from 49.4 to 69.8 years, and median ages from 50 to 77 years (Table 2). Most studies included between 47% and 70.6% male participants, though one subgroup of 10 patients with high SF levels included 90% male participants. None of the studies included information on race, and only one study presented information on the ethnicity of the patient population [20]. Time since diagnosis or time between diagnosis and treatment were infrequently reported and ranged from a median of 8.0 to 16.5 months in the three studies reporting those data [17,26,31]. Only nine studies reported information on transfusion dependence or whether patients required transfusions [15,18,21,23,28-31,34]. Patient populations and inclusion criteria varied; values ranged from 0% of patients who were transfusion dependent in a French registry study in non-transfusion dependent patients [28] to 69% of patients requiring a transfusion in a retrospective cohort study [29]. SF levels at baseline varied within and across studies.
Disease status was reported using several indicators: International Prognostic Scoring System (IPSS) and revised International Prognostic Scoring System (IPSS-R) risk groups, as well as French-American-British (FAB) and World Health Organization (WHO) risk groups. The included studies reported both univariate and multivariate models, with six studies using both to analyze the association between SF levels and outcomes. Studies used a variety of statistical approaches to assess the prognostic value of SF levels, including Cox proportional hazards, Pearson product-moment correlation coefficient or Spearman's rho test, Wilcoxon's test, Mann-Whitney U test, Chi-squared tests, and regression analyses. Variables for which the models controlled were inconsistently reported. When reported, these varied across studies and included age, sex, IPSS subgroup, disease stage or progression, and bone marrow (BM) blasts. A variety of outcomes were predicted using these models. Outcomes were limited to clinical outcomes; no data were identified on economic or humanistic outcomes associated with SF levels.
Findings from the included studies suggested that higher SF levels were associated with reduced survival, although it was difficult to identify strong patterns across the 16 studies. Eleven studies demonstrated the prognostic value of SF levels in OS, with higher SF often indicating worse OS [16-18,21-24,26,27,30,33]. Patients with SF levels >210 ng/mL had significantly poorer survival compared with that of patients with SF levels below that limit (hazard ratio [HR]: 2.14; 95% confidence interval [CI]: 1.02-4.50; p = 0.044) [22]. Higher SF levels measured on a continuous scale were a significant predictor of worse OS in a study of a cohort of 419 patients with primary MDS who were treated at one of four Austrian hematologic centers [33]. For the overall cohort, a log scale increase in SF levels was associated with worse OS (HR: 2.2; p < 0.01). In the subgroup of patients with Low- or Intermediate (Int)-1-risk MDS, the reduction in OS was greater (HR: 2.5; p < 0.01). The authors noted that in other analyses conducted in the study, the hematopoietic stem cell transplant (HSCT)-specific comorbidity index and the Charlson Comorbidity Index were independent predictors of worse OS in the overall cohort. However, when SF levels were incorporated into the model, neither remained an independent predictor of OS. The authors suggested that SF levels may lead to organopathy and comorbidities, further complicating the assessment of SF level as a prognostic factor [33].
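The hazard ratio, confidence interval, and p-value reported for the >210 ng/mL comparison [22] can be cross-checked against one another. The sketch below assumes the interval is a symmetric Wald interval on the log scale, which is common for Cox models but is not stated in the source; the function name `wald_p_from_ci` is hypothetical.

```python
import math

def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate two-sided p-value implied by a hazard ratio and its 95% CI.

    Assumption: the CI is a Wald interval, symmetric around log(HR).
    """
    # Standard error of log(HR), recovered from the width of the log-scale CI.
    se_log_hr = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    # Wald z-statistic for the null hypothesis HR = 1.
    z = math.log(hr) / se_log_hr
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Values reported in the text for SF >210 ng/mL [22].
p_implied = wald_p_from_ci(hr=2.14, lower=1.02, upper=4.50)
print(f"implied p = {p_implied:.4f}")
```

Under that assumption the implied p-value is close to the reported p = 0.044, so the three reported figures are mutually consistent.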
Five studies found no association between higher SF levels and reduced OS [16,28,29,33,35]; however, when examining specific patient subgroups, two studies did find such a relationship [16,33]. SF levels <2000 µg/L were associated with better survival outcomes among non-transplanted patients, but not among the overall study population [16]. A second study, of a French registry of patients with lower-risk, non-transfusion-dependent disease, found no relationship between SF level and OS when SF levels were >300 ng/mL or >1000 ng/mL (p = 0.98 and p = 0.67, respectively, in univariate analyses). The study reported that the five-year OS for patients with SF levels < 300 ng/mL was 62% compared with 64% in patients with SF levels >300 ng/mL. Similarly, in patients with SF > 1000 ng/mL, the five-year OS was 67% compared with 62% in patients with levels < 1000 ng/mL [28].
The studies used various thresholds to define when SF levels were elevated. In one study, patients with SF level > 400 ng/mL had a median OS of 24.2 months compared with a median 42.6 months in patients with SF levels < 400 ng/mL (p = 0.003) [18]; a separate study also used this threshold to define elevated SF levels and found similar significant results (mean survival for SF level < 400 ng/mL: 77.2 months; mean survival for SF level ≥ 400 ng/mL: 44.2 months; p = 0.001) [32]. In a Japanese study of patients with MDS who did not receive iron chelation therapy (ICT), 5-Aza-2′-deoxycytidine (5-AZA), or HSCT, patients with SF levels ≥500 ng/mL had an HR of 10.7 (95% CI: 2.375-48.23; p = 0.002) and an OS of 10.2 months, compared with 118.8 months for patients with SF levels < 500 ng/mL (p = 0.001) [23]. A retrospective review of registry data from Poland evaluated the relationship of SF with multiple outcomes, including worsened survival [34]. In the univariate analysis, patients with SF levels > 1000 ng/mL had a significantly increased risk of worsened survival (HR: 2.94; p = 0.0023).
Though several studies included patients who underwent HSCT, only two assessed transplantation/treatment-related mortality (TRM) or non-relapse mortality (NRM), defined as death from causes other than a relapse of MDS [17,26]. The first study analyzed a US-based cohort and sought to determine whether the disease characteristics at diagnosis of MDS and at the time of HSCT affected patient outcomes; in the cohort, more than 40% were in the High- or Very High-risk IPSS-R category. In the univariate analysis, SF levels >1130 µg/L at the time of HSCT indicated a significantly increased risk of transplantation-related mortality (HR: 2.0; p = 0.009). Upon multivariate analysis (controlling for age, donor type, conditioning intensity, and transplantation year), the association was no longer significant, although the trend was similar (HR: 1.7; p = 0.06) [26]. A second study that evaluated patients with MDS who underwent HSCT examined TRM. While this was not defined explicitly, it was evaluated as being the same as "transplantation-related mortality." Among patients with MDS who underwent HSCT, continuous SF in units of 1000 ng/mL was assessed as a potential prognostic factor for NRM; in the multivariate analysis (controlling for age, sex, transfusions, comorbidities, C-reactive protein levels, WHO classification, and conditioning regimen at HSCT), there was no significant relationship between increase in SF levels and the risk of NRM (HR: 1.1; 95% CI: 0.8-1.4; p = 0.06) [17].

Progressive Disease and Relapse
Across the eight studies identified by the SLR that reported on progressive disease (PD) or relapse-related outcomes [17,21,23,26,28,30,33,34], there was a general trend for higher SF levels to predict worse outcomes. However, there was limited evidence on each of these outcomes; three studies reported event-free survival (EFS) [21,26,33], two studies relapse-free survival (RFS) [17,30], two studies relapse incidence [17,26], one study leukemia-free survival (LFS) [23], and two studies transformation to AML/time to transformation [28,34]. Multivariate models controlled for different covariates of interest across studies, including age and sex, as well as histological subtype, BM blast counts, karyotypes, and conditioning regimen at the time of HSCT; these variables were not reported for all studies. Results are shown in the Supplementary Materials (Table S5).

Event-Free Survival
In the three studies reporting EFS, an event was defined as PD or death. Definition of PD included transformation to AML, an increased number of BM blasts, a higher degree of cytopenia, or advancing in FAB classification. Overall, these studies suggested that higher SF levels indicated worse EFS, but the association between SF level and EFS was not demonstrated consistently. Specifically, higher SF levels were significantly associated with worse EFS in two studies using multivariate models, with SF analyzed in a continuous manner (HR: 1.14; p = 0.001 [21] and HR: 2.00; p < 0.01 [33]). The same trend was demonstrated in a subgroup of Low- and Int-1-risk patients through a multivariate model (HR: 2.9; p < 0.01) [33]; however, in a subgroup of Int-2- or High-risk patients in the same study, higher SF levels were not significantly associated with worse EFS, and the trend was much less pronounced (HR: 1.2; p = not reported [33]). A third study evaluating patients who underwent HSCT found that SF levels at the time of transplant were significantly associated with worse EFS in univariate and multivariate models [26]. When patients had SF levels >1130 µg/L, EFS was significantly worse (HR: 1.6; p = 0.01). When this threshold was increased to 1150 µg/L, the HR increased as well (HR: 1.8; p = 0.002). The study also compared patients with missing SF levels to those with SF ≤1130 µg/L (univariate; HR: 1.5; p = 0.05) and SF ≤1150 µg/L (multivariate; HR: 1; p = 0.9); neither of these comparisons indicated a significant relationship between SF and EFS, though missing SF level data are likely not an informative subgroup for general comparison [26].

Relapse-Free Survival
In general, lower SF was numerically associated with better RFS in the two studies reporting on this outcome, but the association was not always significant. The two studies evaluated patients who underwent HSCT and reported on the association between pre-transplant SF levels and RFS, and neither study provided a clear definition of "relapse" [17,30]. This relationship was not significant in a multivariate model assessing pre-transplant SF as a continuous measure in units of 1000 ng/mL (HR: 1.2; 95% CI: 0.98-1.4; p = 0.08) [17]. However, in the study comparing pre-transplant SF levels ≤1000 ng/mL to >1000 ng/mL, the univariate and multivariate analyses found that a lower SF level (below the threshold) was significantly associated with better RFS (HR: 1.931; 95% CI: 1.239-3.010; p = 0.0037 in univariate analyses and HR: 1.799; 95% CI: 1.147-2.823; p = 0.0106 in multivariate analyses) [30].

Relapse Incidence
The two studies reporting on relapse incidence offered conflicting results. The first study, a US-based retrospective analysis of a cohort of patients who underwent HSCT, defined relapse as "a hematologic recurrence of MDS according to standardized criteria"; SF was not a predictor of relapse incidence in the univariate analysis. When using SF ≤1130 µg/L as a reference, patients with SF levels above that threshold had an HR of 1.0 (p = 0.8), while patients with missing SF data showed a slight trend towards greater relapse, although not significant (HR: 1.7; p = 0.06) [26]. In the second study, a prospective analysis of patients who underwent HSCT, the multivariate analysis of SF in a continuous manner (in units of 1000 ng/mL) indicated that pre-transplant SF levels had a small but significant association with incidence of relapse after HSCT (HR: 1.3; 95% CI: 1.01-1.6; p = 0.04) [17]. Relapse was not defined in that study.
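Because SF was modeled as a continuous covariate in units of 1000 ng/mL [17], the reported hazard ratio applies to each 1000 ng/mL increase and can be rescaled to other increments. The sketch below assumes the log-linear relationship implied by a Cox model with a linear SF term; the function name `hr_for_increase` is hypothetical, and the rescaled values are arithmetic illustrations rather than results reported by the study.

```python
def hr_for_increase(hr_per_unit, units):
    """Hazard ratio implied for an SF increase of `units` x 1000 ng/mL.

    Assumption: the log hazard is linear in SF, so hazard ratios multiply
    across successive unit increases (HR ** units).
    """
    return hr_per_unit ** units

# HR of 1.3 per 1000 ng/mL, as reported for relapse incidence [17].
for units in (0.5, 1, 2):
    increase_ng_ml = int(units * 1000)
    print(f"+{increase_ng_ml} ng/mL -> HR {hr_for_increase(1.3, units):.2f}")
```

Under this assumption, a per-unit hazard ratio that looks small compounds over larger differences in SF, which is worth keeping in mind when comparing continuous-scale results with threshold-based ones.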

Leukemia-Free Survival
SF level was associated with worse likelihood of LFS in a retrospective analysis of a cohort of patients from Japan who did not receive ICT, 5-AZA administration, or a stem cell transplant [23]. The univariate logistic regression analysis suggested that SF levels ≥500 ng/mL were significantly associated with poor LFS, albeit with a wide confidence interval (HR: 21.16; 95% CI: 2.062-217.1; p = 0.01). However, the significance of the association was not maintained when the threshold was set at ≥300 ng/mL (HR: 4.752; 95% CI: 0.852-26.51; p = 0.076).
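The two LFS estimates above can be compared by looking at how wide each interval is and whether it excludes 1. This is a minimal sketch using only the figures reported in the text; the helper name `summarize_interval` and the upper-to-lower ratio as a width measure are illustrative choices, not methods from the cited study.

```python
def summarize_interval(lower, upper):
    """Return (upper/lower ratio, True if the 95% CI excludes 1).

    The ratio is a scale-free measure of CI width for ratio estimates:
    the larger it is, the less precise the estimate.
    """
    excludes_one = not (lower <= 1.0 <= upper)
    return upper / lower, excludes_one

# HR and 95% CI bounds as reported for LFS [23].
estimates = {
    "SF >= 500 ng/mL": (21.16, 2.062, 217.1),
    "SF >= 300 ng/mL": (4.752, 0.852, 26.51),
}

for label, (hr, lower, upper) in estimates.items():
    ratio, excludes_one = summarize_interval(lower, upper)
    print(f"{label}: HR {hr}, upper/lower = {ratio:.1f}, CI excludes 1: {excludes_one}")
```

The ≥500 ng/mL interval spans roughly two orders of magnitude yet still excludes 1, whereas the ≥300 ng/mL interval crosses 1, which matches the loss of significance described in the text.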

Transformation to AML
Two studies evaluating the relationship between SF and progression to AML, or time to progression to AML, could not find a statistically significant association [28,34]. Both studies presented results from univariate models. While neither study reported an effect size, both found no association between SF levels >1000 ng/mL and transformation to AML (p = 0.47 [28] and p > 0.05 [34]), nor was there an association at a lower threshold of SF >300 ng/mL (p = 0.94) [28]. SF levels >1000 ng/mL were not shown to be associated with time to transformation to AML when the outcome was evaluated in a cohort from the MDS-Polish Adult Leukemia Group (PALG) registry (p = 0.35) [34].

Additional Outcomes
Five studies reported on treatment response, medication adherence, blood units transfused, and liver stiffness measurements (Supplementary Materials, Table S6) [15,19,20,25,31]. A retrospective cohort study in Turkey suggested that lower SF levels at the time of MDS diagnosis were significantly associated with better treatment response (p = 0.004). Treatments received by patients within this study were antithymocyte globulin and prednisolone, thalidomide and/or lenalidomide, 5-AZA, or thalidomide and 5-AZA [15]. When adherence was evaluated, deferasirox-adherent patients had statistically significantly lower SF compared with that of nonadherent patients (r = −0.288; p = 0.004) [19]. Two studies noted a positive correlation between SF levels and the number of blood units received, with increases in SF correlated with increases in blood units [20,25], but only one reported it as significant (coefficient: 0.52; p = 0.04) [25]. Finally, in a study evaluating a potential tool to assess liver fibrosis in MDS, the univariate analysis indicated that there was no association between higher SF levels (defined as >320 µg/L in men and >161 µg/L in women) and higher liver stiffness measurements (p = 0.583) [31].

Discussion
This SLR on the association between SF levels and outcomes of interest in patients with MDS identified a breadth of clinical findings from a variety of research settings, although none of the included studies presented relevant results on the impact on humanistic or economic outcomes. These studies suggested that SF levels can serve as a prognostic factor; however, the variation in SF level thresholds used when measuring the same outcome, descriptions of what dictates "higher" SF levels, and the lack of consistently significant results (often for the same outcome, at times within the same study) highlight the difficulty in its broad application. Studies overwhelmingly reported on survival-related outcomes, and the results generally suggested that increased SF levels were associated with worse survival outcomes (whether for OS, EFS, or RFS). Despite this, there was no obvious association between SF levels and the incidence of relapse or PD. This may be due in part to the small body of evidence identified; while a large proportion of studies reported that SF levels were associated with worse OS, few studies reported on EFS or RFS, limiting the opportunity to identify trends across these outcomes. Contrasting results, across and within studies, complicated the comparisons, likely due to differing model variables.
Subgroup comparisons, where reported, also provided challenges resulting from the nature of MDS. For example, conflicting EFS results between a subgroup of Low- and Int-1-risk patients and a subgroup of Int-2- and High-risk patients underscored the potential interaction between SF and comorbidities or risk classification, even when such variables were controlled for. At the same time, however, results illustrated that SF was associated with worse survival outcomes in patients with Low- and Int-1-risk classification, suggesting that the measure retains value as a prognostic tool in these patients.
Generally, there was a low risk of bias observed across the studies included in the SLR. While the review itself was subject to the same limitations as all SLRs (namely, publication bias), its reproducible study design and limited opportunity for bias demonstrate a high degree of rigor. The findings of this review were also limited by the number of included studies reporting univariate analyses only, and very few outcomes had findings confirmed across univariate and multivariate analyses. The results were also somewhat limited by the fact that no data on economic or humanistic outcomes were identified. Future research is needed to fill this data gap and to properly explore the relationship between economic or humanistic outcomes and SF levels. Given the impact that increased SF levels have on survival, it would be important to understand the ways in which SF levels affect other facets of the lives of patients with MDS.
With chelation necessary to help address high SF levels and the physical burden of iron overload, a separate SLR was undertaken to review SLRs reporting on the burden of ICT in patients with MDS. Few data were systematically identified, and those that were identified focused primarily on the survival benefits conferred by ICT. Subsequent targeted searches were undertaken to fill data gaps, with evidence suggesting that the use of ICT was costly to patients but did not provide a significant added benefit in quality of life. Patients with MDS receiving ICT had demonstrably worse quality of life compared with that reported for the general population [36]. Similarly, treatment with ICT over time did not provide meaningful improvement in quality of life [37]. In addition, patients experienced a substantial cost burden resulting from their ICT treatment, with 10-year systems-level costs reaching as high as USD 2 million with deferasirox [38]. The evidence in that companion review suggests that where high SF levels require ICT, clinical, humanistic, and economic outcomes are affected. Additional studies exploring the relationship between SF levels and outcomes would likely inform the burden of ICT use, as well.

Conclusions
The range of evidence identified by this SLR suggests that higher levels of SF are potentially a prognostic indicator for decreased survival in patients with MDS. Additional research using consistent study methods that reduce variation in how results are presented would allow for improved comparisons across studies and populations. Further evidence is needed to determine whether SF levels are associated with economic or humanistic outcomes in these patients.

Author Contributions: E.N.O., K.H., D.T., A.Y., C.H. and F.S. conceived and designed the study and interpreted the data; M.C. and E.S. collected and analyzed the data; S.D. and M.T. conceived and designed the study, and collected, analyzed, and interpreted the data. All authors reviewed the different drafts and approved the final manuscript for submission. The authors are fully responsible for all content and editorial decisions for this manuscript. All authors have read and agreed to the published version of the manuscript.

Funding: This study was funded by Bristol Myers Squibb.
Informed Consent Statement: Not applicable.

Data Availability Statement: The data that support the findings of this study are available in the Supplementary Material of this article and from the corresponding author upon reasonable request.
Celgene International Sàrl, a Bristol-Myers Squibb Company, Chiesi Farmaceutici S.p.A, Novartis Pharma AG, Silence Therapeutics Plc, and Vertex Pharmaceuticals Inc. outside the submitted work.